Author: Nicolas Legrand mailto:nicolas.legrand@cfin.au.dk
# Execute this cell to install systole when running an interactive session
#%%capture
#! pip install git+https://github.com/embodied-computation-group/systole.git@dev
import os
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
from systole.detection import oxi_peaks, interpolate_clipping, ecg_peaks
from systole.plotly import plot_raw, plot_frequency, plot_subspaces, plot_timedomain, plot_nonlinear
from systole import import_dataset1, import_ppg
from systole.utils import heart_rate, to_epochs
from systole.hrv import frequency_domain
from IPython.display import Image
from IPython.core.display import HTML
sns.set_context('talk')
Introduction to cardiac signal analysis for cognitive science¶
Electrocardiography (ECG)¶
Image(url='https://github.com/embodied-computation-group/systole/raw/dev/notebooks/images/ecg.png', width=800)

First, we load a dataset containing ECG and respiratory data. These data were acquired using the paradigm described in Legrand et al. (2020). ECG, EDA and Respiration were acquired while the participant watched neutral and disgusting images. Here, we only need ECG respiration and the stim channel (encoding the presentation of the images), we then only provide these arguments in the modalities parameter to speed up download and save memory.
ecg_df = import_dataset1(modalities=['ECG', 'Respiration', 'Stim'])
0%| | 0/3 [00:00<?, ?it/s]
Downloading ECG channel: 0%| | 0/3 [00:00<?, ?it/s]
Downloading ECG channel: 33%|█████████████████████████████████████████████████████ | 1/3 [00:03<00:07, 3.66s/it]
Downloading Respiration channel: 33%|██████████████████████████████████████████████████▎ | 1/3 [00:03<00:07, 3.66s/it]
Downloading Respiration channel: 67%|████████████████████████████████████████████████████████████████████████████████████████████████████▋ | 2/3 [00:07<00:03, 3.65s/it]
Downloading Stim channel: 67%|█████████████████████████████████████████████████████████████████████████████████████████████████████████▎ | 2/3 [00:07<00:03, 3.65s/it]
Downloading Stim channel: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:10<00:00, 3.64s/it]
Downloading Stim channel: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:10<00:00, 3.64s/it]
R peaks detection¶
Here, we will use the ecg_peaks function to perform R peak detection. The peaks detection algorithms are imported from the py-ecg-detectors module (Porr & Howell, 2019), and the method can be selected via the ecg_method parameter among the following: hamilton, christov, engelse-zeelenberg, pan-tompkins, wavelet-transform, moving-average. In this tutorial, we will use the pan-tompkins algorithm.
signal = ecg_df[(ecg_df.time > 500) & (ecg_df.time < 530)].ecg.to_numpy() # Select the first minute of recording
signal, peaks = ecg_peaks(signal, method='pan-tompkins', sfreq=1000)
This function will output the resampled signal (only if the sampling frequency of the input signal was different from 1000 Hz), and a peaks vector. The peaks vector has the same size than the input signal and is a boolean array (only contain 0/False and 1/True). The R peaks are encoded as 1 and the rest is set to 0. This vector can then be used to plot the detected R peaks on the input signal (panel 1 below). We can also use this vector to compute the distance between each R peaks (the R-R interval, see panel 2 below), which is used to measure the instantaneous heart rate (see panel 3 below).
time = np.arange(0, len(signal))/1000
fig, axs = plt.subplots(3, 1, figsize=(18, 9), sharex=True)
axs[0].plot(time, signal, color='#c44e52')
axs[0].scatter(time[peaks], signal[peaks], color='gray')
axs[0].set_ylabel('ECG (mV)')
axs[1].plot(time, peaks, color='g')
axs[1].set_ylabel('R peaks')
axs[2].plot(time[peaks][1:], np.diff(np.where(peaks)[0]), color='#4c72b0', linestyle='--')
axs[2].scatter(time[peaks][1:], np.diff(np.where(peaks)[0]), color='#4c72b0')
axs[2].set_xlabel('Time (s)')
axs[2].set_ylabel('R-R interval (ms)')
sns.despine()
The R-R intervals are expressed in milliseconds, but can also be converted into beats per minutes (BPM) using the following formula: $\(BPM = \frac{60000}{RR_{Intervals}} \)\( It is worth noting that the conversion from RR intervals to beats per minute involves a non-linear function of the form \)\frac{1}{x}$. This should be taken into account when interpreting the amplitude of heart rate increase/decrease with different baseline. For example, increasing the heart rate from 40 to 45 bpm diminish RR interval by approximately 167ms, while increasing the heart rate from 120 to 125 bpm decrease RR intervals by only 20ms (see graph below). While these differences have only marginal influence in most of the circumstances, this should be taken into account when measuring heart rate variability based on absolute RR interval difference, which is the case of some time-domain metric (e.g. the pNN50 or pNN20).
rr = np.arange(300, 2000, 2)
plt.figure(figsize=(8, 8))
plt.plot(60000/rr, rr)
plt.plot([25, 40], [60000/40, 60000/40], 'gray')
plt.plot([25, 45], [60000/45, 60000/45], 'gray')
plt.plot([25, 120], [60000/120, 60000/120], 'gray')
plt.plot([25, 125], [60000/125, 60000/125], 'gray')
plt.xlabel('BPM')
plt.ylabel('RR intervals (ms)')
plt.title('Converting RR intervals to BPM')
sns.despine()
Time series visualization¶
We can use the interactive plot to visualize this signal and automatically detected R peaks using the plot_raw() function.
plot_raw(ecg_df[(ecg_df.time>500) & (ecg_df.time<550)], type='ecg', ecg_method='pan-tompkins')
Photoplethysmography (PPG)¶
Photoplethysmography is a non-invasive method used to measure the change of blood volume. The PPG signal is characterized by a main systolic peak, often (but not always) followed by a smaller diastolic peaks before the signal return to origin. The lower point between the systolic and diastolic peak is the dicrotic notch. The systolic peaks correspond to the moment where the volume of blood in the blood vessel suddenly increase due to the pressure of the heart contraction. The blood volume measured at the periphery does not change immediately after the cardiac systole, but rather with a delay varying depending on physiological parameters. For this reason, the systolic peak and the R peak are not concomitant but rather delayed, the systolic peak often occurring the T wave of the ECG. The delay between the R wave on the ECG and the systolic peak can vary between individuals and across time in the same individual.
Image(url='https://github.com/embodied-computation-group/systole/raw/dev/notebooks/images/pulseOximeter.png', width=1200)

First, we import an example signal. This time serie represent a PPG recording from pulse oximeter in a young health participant. The sampling frequecy is 75 Hz (75 data points/seconds)
ppg = import_ppg()
Systolic peak detection¶
The main information of interest we can retrieve from the PPG signal is the timing of occurrence of the systolic peak. The timing of these events is indeed tightly linked to the occurrence of the R wave (although with a slightly variable delay), and we can use the peak vector resulting from this analysis the same way we analyse the RR interval time-series with an ECG signal. Because we are not measuring the heart rate strictly speaking, but the pulse at the periphery of the body, this approach is often termed pulse rate variability to distinguish from the heart rate variability that builds on the ECG signal.
Image(url='https://github.com/embodied-computation-group/systole/raw/dev/notebooks/images/ppgRecording.png', width=1200)

As before, you can plot the time serie and visualize peak detection and the inter beat interval time series using the plot_raw function.
plot_raw(ppg)